Parallelizing Multiscale and Multigranular Spatial Data Mining Algorithms
Abstract
Multiscale and Multigranular (MSMG) Spatial Data Mining (SDM) algorithms find the best granular class label from a hierarchical set of granular class labels for spatial classification, a task that is important in many application domains, including the military. However, these algorithms are computationally very expensive because they rank class labels with a complex quality measure. In this paper, we propose a parallel formulation of an MSMG SDM algorithm that scales up to the problem sizes of interest to the Army. The formulation uses the Partitioned Global Address Space (PGAS) model, programmed in Unified Parallel C (UPC), which facilitates sharing of data among processors. Experimental evaluations for land cover classification from satellite imagery show that the proposed parallel formulation achieves a speedup of 6.65 using 8 processors.
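To put the reported speedup in context, the short sketch below computes the corresponding parallel efficiency (speedup divided by processor count). It is only an illustration of the arithmetic: the function name `parallel_efficiency` is hypothetical and is not taken from the paper, and the only inputs are the two figures quoted in the abstract.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup / processors; 1.0 would mean ideal linear scaling.

    Hypothetical helper for illustration only, not part of the paper.
    """
    if processors <= 0:
        raise ValueError("processors must be a positive integer")
    return speedup / processors


# Figures quoted in the abstract: speedup of 6.65 on 8 processors.
reported_speedup = 6.65
reported_processors = 8

efficiency = parallel_efficiency(reported_speedup, reported_processors)
print(f"parallel efficiency: {efficiency:.3f}")  # prints "parallel efficiency: 0.831"
```

An efficiency of roughly 0.83 means that each of the 8 processors contributes about 83% of what it would under ideal linear scaling.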
Similar Resources
A Model for Multigranular Data and Its Integrity
Data involving spatial and/or temporal attributes are often represented at different levels of granularity in different source schemata. In this work, a model of such multigranular data is developed, which supports not only the usual order structure on granules, but also lattice-like join and disjointness operators for relating such granules in much more complex ways. In addition, a model for m...
Parallelizing Frequent Itemset Mining Process using High Performance Computing
Data is growing at an enormous rate, and mining this data is becoming a herculean task. Association rule mining is one of the important techniques used in data mining, and mining frequent itemsets is a crucial step in this process that consumes most of the processing time. Parallelizing the algorithm at various levels of computation will not only speed up the process but will also allow it to han...
Integration Integrity for Multigranular Data
When data from several source schemata are to be integrated, it is essential that the resulting data in the global schema be consistent. This problem has been studied extensively for the monogranular case, in which all domains are flat. However, data involving spatial and/or temporal attributes are often represented at different levels of granularity in different source schemata. In this work, ...
Parallelizing frequent web access pattern mining with partial enumeration for high speedup
The maximum speedup of direct parallelization of pattern-growth mining algorithms for long sequences is limited by the load imbalance among the parallel tasks. In this paper, we present a scheme to parallelize pattern-growth mining algorithms using partial enumeration for high speedup. The experimental results show that partial enumeration increases the achievable speedup of parallel mining sig...
Implementation of Parallelizing Multi-layer Neural Networks Based on Cloud Computing
Background: Cloud computing, a technology that has emerged with the rapid development of modern networks, is mainly used for processing large-scale data. Traditional data mining algorithms, such as neural network algorithms, are usually used for processing small-scale data. Therefore, the calculation of large-scale data using a neural network algorithm must be based on cloud computing. Materials and...